NMR method for differentiating complex mixtures

ABSTRACT

A method for differentiating complex mixtures each having one or more chemical species is provided. The method comprises producing a sample NMR spectrum by subjecting a mixture to a selective spectroscopy process, wherein the NMR spectrum has individual spectral peaks representative of the one or more chemical species within the mixture. The one or more chemical species within the mixture are identified by analyzing the individual spectral peaks, and the individual spectral peaks are then subjected to a multivariate statistical analysis.

RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application Ser. No. 60/712,786 filed Aug. 31, 2005, the disclosure of which is expressly incorporated herein by reference.

This invention was made with government support under grant reference number NIH/NIDDK 3R21 DK070290-01 awarded by the NIH Roadmap Initiative on Metabolomics Technology. The Government has or may have certain rights in the invention.

TECHNICAL FIELD

The present invention is directed toward high-resolution NMR analysis of chemical structures, and more particularly to the use of selective total correlation spectroscopy (“TOCSY”) to quantify and analyze a predetermined set of chemical species.

BACKGROUND OF THE INVENTION

It is well known that nuclear magnetic resonance (NMR) spectroscopy provides highly detailed information on molecular structure. NMR is also quantitative because the detected signal is linearly proportional to the absolute number of active nuclei in the detected sample volume. Thus, relative numbers of hydrogen, carbon or other atoms in a molecule can be directly measured, the relative number of different molecular species in a mixture can be computed, and by using an internal standard (or even an external standard), the absolute concentration of species can be calculated.
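The proportionality between integrated signal and number of nuclei can be illustrated with a short calculation. The following is a minimal sketch, not part of the described method; the function name and the numerical values are hypothetical, and it assumes fully relaxed, well-resolved peaks measured in the same sample as the internal standard.

```python
def absolute_concentration(analyte_integral, analyte_nuclei,
                           standard_integral, standard_nuclei,
                           standard_conc_mM):
    """Estimate an analyte concentration from peak integrals.

    Each integral is divided by the number of equivalent nuclei that give
    rise to the peak, so the ratio compares signal per nucleus.
    All names and values here are illustrative assumptions.
    """
    if analyte_nuclei <= 0 or standard_nuclei <= 0 or standard_integral <= 0:
        raise ValueError("nuclei counts and standard integral must be positive")
    per_nucleus_analyte = analyte_integral / analyte_nuclei
    per_nucleus_standard = standard_integral / standard_nuclei
    return standard_conc_mM * per_nucleus_analyte / per_nucleus_standard


# Hypothetical example: a 3-proton methyl peak with integral 6.0, measured
# against a 9-proton standard peak with integral 9.0 at 0.5 mM.
estimate = absolute_concentration(6.0, 3, 9.0, 9, 0.5)
print(estimate)
```

Normalizing by the number of nuclei per peak is what allows peaks from different molecules to be compared on a molar basis.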

However, when measuring the components of complex mixtures, overlapping resonances often result, which compromises the ability to measure concentrations quantitatively. Even small molecules often give rise to 20 or more spectral lines in the ¹H NMR spectrum, leading to severe overlap for many complex mixtures. For example, in the ¹H NMR spectrum of human urine, over 1000 spectral lines can be at least partially resolved, corresponding to upwards of 100 compounds (See: J. C. Lindon, E. Holmes, and J. K. Nicholson, Prog. NMR Spec. 39, 1 (2001)).

The metabolomics approach, combining high-resolution NMR with multivariate statistical analysis, has been shown to be very powerful for distinguishing biofluid sample subpopulations based on subtle differences in their spectra.^(1,2) This approach can be widely applied to many types of samples, including urine, body fluids, and tissues. NMR based approaches are attractive because they can look at essentially all of the components of a mixture simultaneously, and thus avoid the sometimes difficult process of sample fractionation. These methods can also be rapid and quantitative.

The present invention is intended to address one or more of the problems discussed above.

SUMMARY OF THE INVENTION

The present teachings are directed to a method for differentiating complex mixtures each having one or more chemical species. The method comprises producing a sample NMR spectrum by subjecting a mixture to a selective spectroscopy process, wherein the NMR spectrum has individual spectral peaks representative of the one or more chemical species within the mixture. The one or more chemical species within the mixture are identified by analyzing the individual spectral peaks, and the individual spectral peaks are then subjected to a multivariate statistical analysis.

In another aspect of the present invention, a method for quantifying one or more chemical species within a complex mixture is provided. The method comprises subjecting a first mixture to a total correlation spectroscopy analysis to produce a first spectrum composed of individual spectral peaks representative of the one or more chemical species within the first mixture. A second spectrum is acquired from an isolated standard sample from a second mixture, the second spectrum being produced by subjecting the second mixture to a second total correlation spectroscopy analysis. The first spectrum is then compared to the second spectrum to quantify the one or more chemical species within the first mixture.

In yet another aspect of the present invention, a method for quantifying one or more chemical species within a complex mixture is provided. The method comprises subjecting the mixture to a first total correlation spectroscopy analysis to produce a first spectrum composed of individual spectral peaks representative of the one or more chemical species within the first mixture. A second spectrum is acquired from an isolated standard sample within the mixture, the second spectrum being produced by subjecting the mixture to a second total correlation spectroscopy analysis. The first spectrum is then compared to the second spectrum to quantify the one or more chemical species within the mixture.

The attached claims recite at least some of the novel aspects of the present teachings. Other advantages may well be apparent to one of skill in the art upon consideration of the description of the invention and claims contained herein.

BRIEF DESCRIPTION OF THE DRAWINGS

The above-mentioned aspects of the present teachings and the manner of obtaining them will become more apparent and the teachings will be better understood by reference to the following description of the embodiments taken in conjunction with the accompanying drawings, wherein:

FIG. 1 shows a representative selective TOCSY pulse sequence, wherein the duration of the shaped pulse, designated “SP,” was varied to produce the data presented in FIGS. 2 and 3;

FIG. 2 shows the effect of shaped pulse duration, “SP,” on the selectivity of the 1D TOCSY experiment. (A) 1D proton spectrum of a mixture of 10 mM L-Proline and 10 mM L-Arginine in pH 7 phosphate buffer and 10% D₂O. Spectra were taken using 1D NOESY Presat sequence for water suppression. (B) Selective TOCSY spectrum of this sample with selective excitation frequency set on the proline γ peak at 2.0 ppm (*) and SP=40 ms. (C) Same experiment as shown in B, but with SP=10 ms;

FIG. 3 shows the effect of shaped pulse duration, “SP,” on the signal to noise ratio of the TOCSY peaks produced in the selective TOCSY experiment. All spectra were taken on the Proline-Arginine mixture described in FIG. 2. (A) On resonance irradiation of isolated target peaks, (●) S/N of proline β TOCSY peak (2.37 ppm) produced by selective irradiation of the proline α peak (4.15 ppm), (▪) S/N of arginine α TOCSY peak (3.77 ppm) produced by selective irradiation of the arginine δ peak (3.25 ppm). (B) Effects associated with off resonance irradiation of nearby peaks, (♦) S/N of proline α TOCSY peak (4.15 ppm) produced by selective irradiation centered on the proline γ peak (2.00 ppm), (▴) S/N of arginine α TOCSY peak (3.77 ppm) produced by selective irradiation centered on the proline γ peak (2.00 ppm);

FIG. 4 shows (A) Proton NMR spectrum of rat urine acquired using the 1D presat NOESY sequence to achieve water suppression. (B) Low field expansion of the rat urine proton NMR spectrum. (C) Selective TOCSY of rat urine with the selective pulse frequency set on the hippurate 7.88 ppm peak (*), and acquired with SP=10 ms;

FIG. 5 shows (A) High field expansion of the proton NMR spectrum of human urine spiked with 1 mM each of leucine, isoleucine and valine. Spectrum was acquired using the 1D presat NOESY sequence to achieve water suppression. (B) High field expansion of the selective TOCSY spectrum of the spiked human urine sample from A with the selective pulse frequency set on the leucine-isoleucine-valine methyl peaks around 1.00 ppm (*), and acquired with SP=10 ms. (C) Selective TOCSY spectrum of a 10 mM solution of leucine with the selective pulse frequency set on the methyl peak around 1.00 ppm (*), and acquired with SP=10 ms. The C-alpha peaks appear somewhat low in intensity due to the poor efficiency of the TOCSY mixing cycle in this case;

FIG. 6 shows (A) High field expansion of proton NMR spectrum of human urine. (B) High field expansion of the proton NMR spectrum of human urine sample from A spiked with 250 μM isoleucine. (C) High field expansion of semiselective TOCSY spectrum of human urine from A. (D) High field expansion of semiselective TOCSY spectrum of human urine sample from A spiked with 250 μM isoleucine. Both TOCSY spectra were taken with SP=10 msec centered at a frequency of 1.00 ppm;

FIG. 7 shows PC1 vs. PC2 score plots from the isoleucine spiking study of human urine. (A) Score plot from PCA calculated using 298 bins of the semiselective TOCSY spectra as data inputs (1.2 to 4.2 ppm data with selective excitation at 1.00 ppm). (B) Score plot from PCA calculated using 298 bins of the 1D proton spectra as data inputs (1.2 to 4.2 ppm). (C) Score plot from PCA calculated using the 57 bins of the 1D proton spectra containing isoleucine peaks (exclusive of the methyl peaks near 1.00 ppm) ▴=isoleucine spiked, ♦=control. PC1 and PC2 account for 56.3% of the total variance in (A), 90.0% in (B) and 93.4% in (C);

FIG. 8 shows (A) High field expansion of proton NMR spectrum of human urine. (B) High field expansion of the proton NMR spectrum of human urine sample from A spiked with 250 μM isoleucine. (C) High field expansion of semiselective TOCSY spectrum of human urine sample from A. (D) High field expansion of semiselective TOCSY spectrum of human urine sample from A spiked with 250 μM isoleucine. Both TOCSY spectra were taken with SP=10 msec centered at a frequency of 1.00 ppm (*); and

FIG. 9 shows PC1 vs PC2 score plots from an isoleucine spiking study of human urine. (A) Score plot from PCA calculated using 398 bins of the semiselective TOCSY spectra as data inputs (0.2 to 4.2 ppm data with selective excitation at 1.00 ppm). (B) Score plot from PCA calculated using 398 bins of the 1D proton spectra as data inputs (0.2 to 4.2 ppm data). (C) Score plot from PCA calculated using the 129 bins of the 1D proton spectra containing isoleucine peaks (including methyl peaks at 0.90 to 1.1 ppm) ▴=isoleucine spiked, ♦=control.

Corresponding reference characters indicate corresponding parts throughout the several views.

DETAILED DESCRIPTION

The embodiments of the present teachings described below are not intended to be exhaustive or to limit the teachings to the precise forms disclosed in the following detailed description. Rather, the embodiments are chosen and described so that others skilled in the art may appreciate and understand the principles and practices of the present teachings.

An experimental selective total correlation spectroscopy (TOCSY) method for quantifying several chemical species of honey is described herein (see also a recently published study³ which was authored by the present inventors and is hereby incorporated by reference in its entirety). It has been discovered that this TOCSY method is a useful alternative to the standard metabonomic analysis of biofluids. According to this method, a selective excitation pulse and TOCSY mixing period were used to focus the statistical analysis on a few pre-selected components in honey, such as amino acids for instance. Through this analysis, it was discovered that the discrimination of subpopulations in a set of samples is substantially improved, particularly as the signals used come almost exclusively from components that vary significantly between samples. One aspect of this method is that it facilitates the accurate quantification of a predetermined set of chemical species, regardless of whether these species are major or minor components of the mixture. As such, a set of chemical compounds to be studied in a metabonomic analysis may then be chosen based on their metabolic or pharmacological significance. For instance, a specific subset of chemical compounds present in a biofluid may be chosen for study because they are known to be metabolically related.

The present methods are capable of differentiating complex and largely similar mixtures by enhancing the quantitative measurement of minor components using NMR spectroscopy. Determination of the concentration of these species can be used in one of a number of multivariate statistical analyses to differentiate similar but complex mixtures, such as those found in biofluids and/or other liquids. To achieve this, the present methods involve a combination of advanced NMR methods with multivariate statistical correlations, such as the Pearson product moment correlation test, unsupervised multivariate statistical analyses, such as principal component analysis (“PCA”), as well as supervised multivariate statistical analyses, such as orthogonal partial least squares discriminant analysis (“O-PLS-DA”). Moreover, the methods are capable of detecting low concentration species, as well as analyzing a wide range of mixtures, including biofluids such as blood, urine, spinal fluid, etc., liquid foods, chemical feedstocks, such as petroleum, and so forth, where different classes of molecules are present producing complicated, overlapping spectral features. The methods may also be used to select one or more molecules in a mixture, simplify their NMR spectra, increase their detection sensitivity, allow for quantitative evaluation of those molecules, and to differentiate mixtures that differ in the concentration of these molecular species that may be minor components in the mixtures. Additionally, the methods may also be used to differentiate sick and healthy patient samples by focusing on lipids, sugars, amino acids or other such metabolites.

The present methods enhance the ability of NMR to differentiate complex mixtures, as well as cause selective excitation of certain nuclear spins and nuclear spin polarization transfer to other nuclear spins on the same molecule. The methods are also capable of identifying and quantitating molecular components by simplifying the NMR spectra of the mixture. This approach can be used to select a certain molecular species, or several species, simplify their NMR spectra, increase their detection sensitivity in the presence of a complicating matrix, and allow for quantitative evaluation of these selected molecules. The concentrations, or more typically the NMR spectra of these species, are then subjected to multivariate statistical analysis, such as principal component analysis, to allow differentiation of the samples. Many types of multivariate statistical analysis can be applied once the spectra are simplified by selective excitation. While other NMR processes are available, such as LC-NMR for instance, the present TOCSY processes have significant unique advantages over these existing methods. For instance, the present methods are more rapid, since the selective TOCSY experiments require very little time or effort for sample preparation, and they avoid any possibility of differential sample fractionation, particularly since no physical separation of the mixture components is involved.

The use of quantitative selective excitation (selective) TOCSY and multivariate statistical analysis can be very useful to differentiate otherwise very similar samples. For instance, in the above-referenced publication, the inventors differentiated 8 different honey samples based on the concentrations of their amino acid content. The concentrations of these amino acids are typically 200 times less than the major components, α-glucose, fructose, other sugars and water.

One challenge in using processes such as selective TOCSY to detect single molecular species within complex mixtures is that such processes can produce the simultaneous excitation of several molecular spin systems at once. When this happens, problems with the purity of the individual TOCSY peaks observed and/or with their assignment into specific spin systems can occur. While it is in principle possible to use very selective excitation approaches in order to address this problem, in most cases greater spin system selectivity can only be gained at the expense of sensitivity. This is an unacceptable trade-off when dealing with biofluid samples. To address this challenge, the present inventors have discovered an alternative two-stage modification process to the basic selective TOCSY system. At an initial stage, using a less selective excitation in the TOCSY pulse sequence optimizes the sensitivity and data collection efficiency of the experiment, at the expense of spin system selectivity. At a second stage, application of the Pearson product moment correlation coefficient method to the TOCSY peak integral intensities provides a test for individual TOCSY peak purity, and allows for the assignment of the peaks into spin systems.

Another known challenge of NMR and metabolomics analysis approaches is that in many biofluids of interest (such as urine and blood serum), only a fraction of the NMR spectral features of the different chemical species are capable of being resolved. For instance, the selective TOCSY process can behave “semiselectively” when applied to these mixtures, wherein a single selective TOCSY spectrum will very often contain peaks from several different chemical species. Using human and rat urine as examples of typical biofluid samples, the present inventors examined and compared two different solutions to this problem. Longer shaped pulse durations in the selective TOCSY pulse sequence were found to narrow the selective excitation band, thus focusing the experiment more selectively on individual chemical species. However, increasing the shaped pulse duration resulted in a significant decrease in the intensity of the TOCSY peaks, which also reduces the sensitivity. Alternatively, it was discovered that relatively short excitation pulse durations can be used to produce a spectrum composed of more intense TOCSY peaks. This results in a “semiselective” TOCSY spectrum in which the TOCSY peaks will, in general, derive from several different chemical species.

While problems may initially be encountered when identifying which chemical species are represented by a particular TOCSY peak, or whether a given TOCSY peak represents any single species, the judicious application of the Pearson product moment correlation coefficient method can be used to resolve these problems.^(4,5) Statistical correlation methods provide a good means to identify related peaks and even molecules in metabonomics studies, as was recently shown by Nicholson and coworkers.⁶ A major issue in classical metabonomics studies is the discriminating power of the method when minor components are the significant varying species. Classical metabonomic studies usually employ complete 1D proton NMR spectra as data inputs for PCA calculations.⁷⁻¹⁴ Because of significant spectral overlap in samples such as urine or serum, the practical limit of detection is relatively high and thus discrimination of similar samples is challenging. It was found that using semiselective TOCSY spectra as PCA data inputs is more sensitive and reliable than the classical metabonomic approach when dealing with small differences in complex biofluid compositions. In essence, the coupling inherent in the NMR spectrum can be used as a filter to lower the threshold of detection for discriminating different sample subpopulations.

Additionally, it is possible to “multiplex” the selective excitation process by using an approach called Hadamard Transform NMR.²⁰ In Hadamard NMR spectroscopy, the peaks of interest are irradiated selectively using a frequency-domain multiplex scheme. With this approach, there is no loss in sensitivity per unit time. Hadamard matrices are used to determine the multiplexed irradiation frequencies and are then used to decode the NMR spectra as they are being processed. Most types of NMR experiments can be implemented in this fashion, as long as the frequencies of the signals of interest are known in advance. Multidimensional NMR experiments can be implemented using Hadamard principles, leading to large time savings. This approach then, because of its inherently quantitative nature, can be combined with multivariate statistical analysis such that it can be used to differentiate complex samples.
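The encode and decode roles of the Hadamard matrix can be illustrated numerically. The following is a schematic sketch only: it treats each irradiated frequency as contributing a single number per scan, uses the Sylvester construction (which requires a power-of-two size), and all function names and values are hypothetical rather than taken from any spectrometer software.

```python
def hadamard(n):
    """Build an n x n Hadamard matrix by the Sylvester construction."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def encode(h, signals):
    """Each scan records all signals at once, with signs from one matrix row."""
    return [sum(sign * s for sign, s in zip(row, signals)) for row in h]


def decode(h, scans):
    """Recover individual signals using H^T H = n I."""
    n = len(h)
    return [sum(h[k][j] * scans[k] for k in range(n)) / n for j in range(n)]


# Hypothetical responses from four selectively irradiated frequencies.
signals = [3.0, 1.0, 0.5, 2.0]
h4 = hadamard(4)
scans = encode(h4, signals)
print(decode(h4, scans))
```

Because every scan contains contributions from all of the chosen frequencies, each decoded signal benefits from all of the scans, which is the basis of the sensitivity argument made above.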

Advantages and improvements of the processes of the present invention are demonstrated in the following examples. The examples are illustrative only and are not intended to limit or preclude other embodiments of the invention.

EXAMPLES

NMR Samples: L-Amino Acids and TSP (sodium 3-trimethylsilyl (2,2,3,3-²H₄) 1-propionate) were purchased from Sigma-Aldrich (St. Louis, Mo.) and used without further purification. For NMR analysis, amino acid solutions were prepared in 50 mM phosphate buffer at pH 7. Human urine was collected from volunteers. A sample of urine collected from a male adult Sprague-Dawley rat was the generous donation of Dr. Peter Kissenger and Dr. Chester Duda of Bioanalytical Systems, Inc. (West Lafayette, Ind.). For NMR analysis, urine samples were prepared by the addition of 60 μl of 1 M phosphate buffer, pH 7, to 540 μl of neat urine. For the PCA study, urine samples were collected at three time points during the day, and these samples were split into thirds, forming 9 samples, 6 of which were spiked with varying concentrations of isoleucine. All NMR samples were run in 5 mm tubes with 10% added D₂O (Cambridge Isotopes Laboratory, Inc.) and 50 μl TSP.

NMR Spectroscopy: NMR spectra were taken on a Bruker AVANCE DRX 500 MHz spectrometer (Bruker-Biospin, Fremont, Calif.), using a 5 mm inverse HCN triple resonance probe equipped with XYZ axis gradient coils. All spectra were acquired at 25° C., and were referenced to the TSP methyl peak at 0.00 ppm. Proton spectra were acquired using a 1D NOESY pulse sequence incorporating presaturation for water suppression during the relaxation delay and mixing time.^(15,16) The relaxation delay and mixing times were set to 2 s and 300 ms, respectively, and the presaturation power used was the minimum needed to effect complete suppression of the water peak. In order to achieve high signal-to-noise ratios for minor components, 64 FID transients were averaged, resulting in a total acquisition time of 7 min. Selective TOCSY experiments used the standard pulse sequence found in the Bruker XWINNMR pulse program library (see FIG. 1), which consists of a train of a hard 90° pulse, a z gradient, a selective 180° pulse and a z gradient to achieve selective excitation of the target peak, followed by an MLEV-17 TOCSY spin lock.¹⁷⁻¹⁹ It should be understood that the pulse sequence shown in FIG. 1 illustrates one example of carrying out a selective TOCSY experiment. Many choices exist for the selective pulse and the mixing cycle, and in some experiments, the first pulse can be eliminated. Here, gaussian-shaped pulsed z-field gradients were 1 ms in duration and 14 mT/m at maximum strength. Secant-shaped selective 180° pulses were found to be most effective for selective excitations. The duration of the shaped pulse was varied as described below. TOCSY mixing times were 70 ms. Thirty-two 16 K point FID transients were averaged in each selective TOCSY experiment, resulting in an acquisition time of 1 min. The “spdisp” utility incorporated in the Bruker XWINNMR software package was used to determine shaped pulse excitation frequency band widths. 1 Hz line broadening was used in processing the spectra.

It should be appreciated and understood that the parameters included herein are for illustrative purposes only, wherein other types of selective pulses, mixing cycles and mixing times may be employed by those skilled in the art while still encompassing the scope of the present teachings. As such, the present teachings are not intended to be limiting herein.

Pearson Product Moment Experiments: A series of 64 random numbers with a mean value of 1.00 and a standard deviation of 0.25 was generated using the Microsoft EXCEL random number utility (Microsoft Corp., Redmond, Wash.). Negative values in the list of random numbers were discarded, and the first 27 values of the remaining numbers were used as random mM concentrations of leucine, isoleucine and valine added to nine aliquots of a single human urine sample. In this way, nine NMR samples of human urine with random concentrations of leucine, isoleucine and valine were produced. Selective TOCSY spectra taken on this set of samples were processed and transformed using the same parameters, and the TOCSY peaks over the resulting set of nine spectra were base-line corrected and then integrated as a set using the XWINNMR multiple integration macro written in-house. The same chemical shift limits were used for all spectral integrations. The resulting text file containing the relative integral intensities was read into a Microsoft EXCEL spread sheet, and the matrix of numbers used as input data for Pearson product moment correlation coefficient calculations performed using the EXCEL utility.
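The randomization step described above can be sketched in a few lines. This is an illustration only: the text does not state which distribution the EXCEL utility drew from or how the 27 values were distributed among the samples, so a normal distribution and a simple three-values-per-sample assignment are assumed here, and the function name and seed are hypothetical.

```python
import random


def random_spike_concentrations(n_samples=9,
                                analytes=("leucine", "isoleucine", "valine"),
                                mean=1.00, sd=0.25, pool_size=64, seed=0):
    """Return one dict of random mM spike concentrations per sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pool = [rng.gauss(mean, sd) for _ in range(pool_size)]  # assumed normal
    kept = [v for v in pool if v >= 0]  # negative values are discarded
    needed = n_samples * len(analytes)  # 9 x 3 = 27 in the described study
    if len(kept) < needed:
        raise ValueError("not enough non-negative values in the pool")
    values = kept[:needed]
    step = len(analytes)
    # Assumed assignment: consecutive triples go to consecutive samples.
    return [dict(zip(analytes, values[i * step:(i + 1) * step]))
            for i in range(n_samples)]


for number, sample in enumerate(random_spike_concentrations(), start=1):
    print(number, {name: round(conc, 2) for name, conc in sample.items()})
```

Varying the three amino acids independently is what later allows peaks from different spin systems to be told apart by their correlation across the sample set.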

PCA Calculations: For the isoleucine spiking PCA study, nine human urine samples (three controls and six spiked with 250±65 μM isoleucine) were prepared as described above from 3 different urine samples from the same individual. 1D proton and semiselective TOCSY spectra for the nine urine samples were acquired, transformed and phased using XWINNMR. The real parts of the transformed spectra were converted to XY plot format JCAMP files. Each JCAMP file was text edited to remove header text and X data, and then read in as a column into an EXCEL spreadsheet. The 8 K points of each spectrum were 4-fold binned to yield a 2 K data column. In this way two 2 K by nine matrices were constructed, one matrix containing the set of nine 1D proton spectra, and one matrix containing the set of nine semiselective TOCSY spectra. 298 point segments of these two matrices, corresponding to the 1.2 to 4.2 ppm chemical shift region of the spectra, were used as input data for correlation PCA calculations performed in MINITAB 13 (MINITAB Inc., State College, Pa.). The relative variance contributions of the first two principal components are indicated in the figure captions. In all cases fewer than eight principal components were found to be adequate to account for >99.9% of the variance.
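The binning and correlation PCA steps can be sketched without any statistics package. The code below is a simplified illustration, not the MINITAB procedure: it assumes that binning means summing adjacent points, standardizes each bin across samples (the usual meaning of correlation PCA), and extracts only the first principal component by power iteration. All names and the toy data are hypothetical.

```python
import math


def bin_spectrum(points, factor=4):
    """Sum consecutive groups of `factor` points (trailing remainder dropped)."""
    n_bins = len(points) // factor
    return [sum(points[i * factor:(i + 1) * factor]) for i in range(n_bins)]


def standardize_columns(rows):
    """Centre and scale each bin across samples; constant bins become zeros."""
    n = len(rows)
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def first_pc_scores(rows, iterations=200):
    """PC1 scores via power iteration on the sample-by-sample Gram matrix."""
    z = standardize_columns(rows)
    n = len(z)
    gram = [[sum(a * b for a, b in zip(z[i], z[j])) for j in range(n)]
            for i in range(n)]
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        new = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0:
            return [0.0] * n
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * sum(gram[i][j] * vec[j] for j in range(n))
                     for i in range(n))
    # Scores are the eigenvector scaled by the singular value.
    return [v * math.sqrt(eigenvalue) for v in vec]


# Toy binned data: three "control" rows followed by three "spiked" rows.
toy = [[1.0, 5.0, 2.0], [1.1, 5.2, 2.1], [0.9, 4.9, 1.9],
       [1.0, 8.0, 4.0], [1.1, 8.3, 4.2], [0.9, 7.9, 3.9]]
print(bin_spectrum([1, 2, 3, 4, 5, 6, 7, 8]))
print([round(s, 2) for s in first_pc_scores(toy)])
```

In the toy data the two groups fall on opposite sides of zero along PC1; the overall sign of a principal component is arbitrary, so only the separation is meaningful.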

The Effect of Shaped Pulse Duration: The pulse sequence used in the selective TOCSY experiment is shown in FIG. 1.¹⁷⁻¹⁹ The duration of the shaped pulse, used in this experiment in a 180° refocusing mode, and denoted as “SP” in FIG. 1, will largely determine the frequency width excited by the selective pulse sequence. In general, of course, lengthening the shaped pulse duration will narrow the excitation frequency band width. This relationship can be quantitatively evaluated using the Bruker XWINNMR “spdisp” utility.

A system composed of 10 mM L-arginine and 10 mM L-proline was used to examine experimentally the effects of varying the selective TOCSY sequence shaped pulse duration (FIG. 2). The results of particular interest in these experiments are the effects on the TOCSY peak intensities, a consideration of central importance if the selective TOCSY experiment is to be used for the quantitative analysis of specific chemical components of a biofluid mixture. In general, the results observed when varying the shaped pulse duration will depend on the NMR spectrum of the target chemical compound. However, two general cases can be described. In cases where the target excitation peak is well separated in the spectrum from any other peak of its own spin system, the intensities of the TOCSY peaks will increase with an increase in the shaped pulse excitation frequency band width, until the excitation frequency band width equals the spectral width of the target peak. This case is illustrated by the two experiments presented in FIG. 3A. Note that for these examples the target peak width is approximately 20 Hz, corresponding to an optimal shaped-pulse duration of 40 ms.

It should be noted that the simple relationship between shaped pulse duration and TOCSY peak intensity described above is observed only when the target excitation peak is well separated in the spectrum from other peaks of its own spin system. In the examples presented in FIG. 3A, target peaks were 350 to 275 Hz away from the nearest neighbor peaks of their own spin systems. The second general case that can be described occurs when the target excitation peak is close to another peak of its own spin system. In this case off-resonance excitation of this second neighboring peak will increasingly occur as the shaped pulse duration is shortened. The effects of this in the selective TOCSY experiment are presented in FIG. 3B, where the signal to noise ratio of the proline α TOCSY peak is plotted as a function of the shaped pulse duration, with the shaped pulse excitation frequency centered on the proline γ peak (♦). The proline γ peak width is approximately 56 Hz, which should correspond to an optimal selective-pulse duration of 14 ms. However, the proline β2 peak occurs only 13 Hz down field from the proline γ peak. Consequently, as the shaped pulse duration centered on the proline γ peak is shortened below 10 ms, dramatic increases in the proline α TOCSY peak intensity are observed, due to the off resonance excitation of the proline β2 peak. Concurrently, strong arginine peaks are also observed in the TOCSY spectrum due to the off-resonance irradiation of the arginine γ peak (FIG. 3B-▴). The experiment has become “semiselective.”
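The two quoted pairings of peak width and optimal pulse duration (approximately 20 Hz with 40 ms, and approximately 56 Hz with 14 ms) are both consistent with a duration that is inversely proportional to the peak width. The sketch below makes that inference explicit. It is an assumption drawn only from those two data points; the true relationship depends on the pulse shape, and the text itself relies on the “spdisp” utility rather than on any such formula. The names are hypothetical.

```python
# (peak width in Hz, optimal shaped-pulse duration in ms) as quoted in the text
QUOTED_PAIRS = [(20.0, 40.0), (56.0, 14.0)]


def implied_constant(pairs):
    """Average of width (Hz) x duration (s) over the quoted pairs."""
    return sum(width * duration_ms / 1000.0
               for width, duration_ms in pairs) / len(pairs)


def suggested_duration_ms(width_hz, constant):
    """Duration estimate assuming width x duration stays constant."""
    if width_hz <= 0:
        raise ValueError("peak width must be positive")
    return 1000.0 * constant / width_hz


k = implied_constant(QUOTED_PAIRS)  # (0.800 + 0.784) / 2
for width, quoted_ms in QUOTED_PAIRS:
    print(width, quoted_ms, round(suggested_duration_ms(width, k), 1))
```

A rule of thumb of this kind only gives a starting point; as the paragraph above explains, the best duration in practice also depends on how close neighboring peaks of the same spin system lie.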

When discussing the quantitative results to be expected from the selective TOCSY experiment, it is obvious that each spin system in a mixture constitutes its own special case. However, all spin systems show an increase in TOCSY peak intensity when the shaped-pulse duration is shortened. So in terms of sensitivity, it is beneficial to use a relatively short, or “semiselective,” shaped pulse duration. Since a shorter shaped-pulse duration will in general excite more spin systems at the same time, shorter pulse durations are also more efficient in terms of surveying the chemical species present in a biofluid sample.

Use of the Pearson Product Moment Method to Test the Purity of TOCSY Peaks: In the case of urine samples, spectral overlap varies from mild to severe. FIG. 4A shows the proton spectrum of urine collected from an adult male Sprague-Dawley rat. In some cases, the selective TOCSY experiment can yield relatively pure single component spectra. An example of this is rat urine hippurate, the selective TOCSY spectrum of which is shown in FIG. 4C. More typically, excitation of any given peak in the urine spectrum will give rise to a spectrum containing peaks from several different spin systems. An example of this is shown in FIG. 5B, where excitation of the human urine amino acid methyl peak at 1 ppm yields a TOCSY spectrum containing peaks from leucine, isoleucine and valine. In either case, however, regardless of whether a single spin system or multiple spin systems are excited, the separate individual TOCSY peaks, if pure, may be used to measure the concentrations of the chemical species present in the mixture.³

With the use of less selective excitation, or in the case of severely overlapped spectra in complex mixtures, it becomes more likely that any resolved TOCSY peak produced will contain contributions from several different chemical species. For example, it is not clear a priori that any of the urine sample TOCSY peaks resolved in FIG. 4C or 5B are pure. This of course raises the possibility that a particular TOCSY peak can no longer be accepted as an accurate measure of the concentration of a particular chemical species.

In addition, since the application of a less selective excitation pulse to a biofluid mixture also generally produces a more complex TOCSY spectrum, containing peaks from several different spin systems, it may also become difficult to assign specific peaks to specific chemical species. An example of this is the α-proton region in the human urine TOCSY spectrum shown in FIG. 5B. Here it is clear that the four TOCSY peaks resolved between 3.7 and 3.8 ppm are amino acid α-proton peaks. However, which of the three target amino acids each of the four α peaks belongs to is unclear.

Fortunately, however, the Pearson product moment correlation coefficient method can be used as a statistical test to determine the purity of any particular TOCSY peak, and is also useful to help define which peaks belong to the same spin system. For two independent variables x_(i) and y_(i), measured in sample i over a set of samples, with average values X and Y, the Pearson product moment correlation coefficient, PM, is given by:^(4,5)

$PM = \frac{\sum_{i} (x_{i} - X)(y_{i} - Y)}{\sqrt{\sum_{i} (x_{i} - X)^{2}}\,\sqrt{\sum_{i} (y_{i} - Y)^{2}}}$

If x and y are the integral intensities of two TOCSY peaks belonging to the same spin system, and neither is significantly contaminated by peaks of another spin system, then the peak intensities of the two will be highly correlated when they are measured over a set of samples, and the PM calculated for the two peaks will be close to one. Otherwise, the correlation will be significantly diminished.
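The PM calculation described above can be illustrated with a minimal Python sketch. The function name `pearson_pm` and the integral values are hypothetical and chosen only for illustration; they are not data from the experiments described herein. NumPy is assumed to be available.

```python
import numpy as np

def pearson_pm(x, y):
    """Pearson product moment correlation of two series of peak integrals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()          # deviations from the average value X
    dy = y - y.mean()          # deviations from the average value Y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Hypothetical integrals of three TOCSY peaks over five samples.
peak_a = [1.0, 2.0, 3.0, 4.0, 5.0]
peak_b = [2.1, 3.9, 6.2, 8.0, 9.8]   # tracks peak_a, as for a shared spin system
peak_c = [3.0, 1.0, 4.0, 1.0, 5.0]   # varies independently of peak_a

print(pearson_pm(peak_a, peak_b))    # close to one
print(pearson_pm(peak_a, peak_c))    # substantially lower
```

In this sketch a pair of peaks from the same uncontaminated spin system would be expected to behave like `peak_a` and `peak_b`, while a contaminated or unrelated peak would behave more like `peak_c`.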

Table 1 summarizes the results from a set of experiments in which semiselective TOCSY measurements were made on a set of 9 human urine samples generated by spiking a single sample of urine with random concentrations of leucine, isoleucine and valine. Most urine samples will show widely variable amounts of these amino acids, as well as certain other unidentified species with peaks occurring near 1.00 ppm. The particular urine sample chosen for these experiments had negligible amounts of these amino acids before spiking, which allowed for good experimental control over the amounts of these three amino acids present. A single semiselective TOCSY experiment was performed on each of the 9 samples, using a 10 ms shaped pulse duration centered on the amino acid methyl peak of 1.00 ppm. The PM values calculated from the integrated TOCSY peaks in these experiments were 0.88 to 0.99 for peaks belonging to the same spin systems, and less than 0.58 for peaks belonging to different spin systems (Table 1). If any peak contained significant contamination from a spin system of a second molecule, which presumably would not be statistically related over the sample set to the target spin system, then its intra spin system PM values would be significantly reduced.

TABLE 1
Pearson Product Moment Correlation Coefficients for Aliphatic Amino Acid Semiselective TOCSY Peaks Measured in Human Urine

           LEU α    LEU β,γ  VAL α1   VAL α2   VAL β    ILE α    ILE β    ILE γ1   ILE γ2
LEU α      1.000    0.916    0.234    0.326    0.389    0.397    0.424    0.440    0.469
LEU β,γ    0.916    1.000    0.484    0.554    0.576    0.382    0.292    0.323    0.320
VAL α1     0.234    0.484    1.000    0.934    0.886    0.367    0.126    0.152    0.070
VAL α2     0.326    0.554    0.934    1.000    0.919    0.204    −0.007   0.056    −0.007
VAL β      0.389    0.576    0.886    0.919    1.000    0.479    0.320    0.393    0.326
ILE α      0.397    0.382    0.367    0.204    0.479    1.000    0.901    0.918    0.887
ILE β      0.424    0.292    0.126    −0.007   0.320    0.901    1.000    0.967    0.961
ILE γ1     0.440    0.323    0.152    0.056    0.393    0.918    0.967    1.000    0.994
ILE γ2     0.469    0.320    0.070    −0.007   0.326    0.887    0.961    0.994    1.000
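One way the Table 1 coefficients might be used to sort peaks into spin systems is sketched below in Python. The grouping rule (linking any two peaks whose PM meets a cutoff) and the cutoff of 0.85 are assumptions made for illustration, not a procedure stated in the present teachings; the matrix values are those of Table 1, and `group_peaks` is a hypothetical helper name.

```python
import numpy as np

names = ["LEU α", "LEU β,γ", "VAL α1", "VAL α2", "VAL β",
         "ILE α", "ILE β", "ILE γ1", "ILE γ2"]

# PM values copied from Table 1.
PM = np.array([
    [1.000, 0.916, 0.234, 0.326, 0.389, 0.397, 0.424, 0.440, 0.469],
    [0.916, 1.000, 0.484, 0.554, 0.576, 0.382, 0.292, 0.323, 0.320],
    [0.234, 0.484, 1.000, 0.934, 0.886, 0.367, 0.126, 0.152, 0.070],
    [0.326, 0.554, 0.934, 1.000, 0.919, 0.204, -0.007, 0.056, -0.007],
    [0.389, 0.576, 0.886, 0.919, 1.000, 0.479, 0.320, 0.393, 0.326],
    [0.397, 0.382, 0.367, 0.204, 0.479, 1.000, 0.901, 0.918, 0.887],
    [0.424, 0.292, 0.126, -0.007, 0.320, 0.901, 1.000, 0.967, 0.961],
    [0.440, 0.323, 0.152, 0.056, 0.393, 0.918, 0.967, 1.000, 0.994],
    [0.469, 0.320, 0.070, -0.007, 0.326, 0.887, 0.961, 0.994, 1.000],
])

def group_peaks(pm, labels, threshold=0.85):
    """Group peaks linked, directly or through other peaks, by PM >= threshold."""
    unassigned = set(range(len(labels)))
    groups = []
    while unassigned:
        stack, members = [min(unassigned)], set()
        while stack:
            i = stack.pop()
            if i in members:
                continue
            members.add(i)
            # Follow every remaining peak that correlates strongly with peak i.
            stack.extend(j for j in unassigned
                         if j not in members and pm[i, j] >= threshold)
        unassigned -= members
        groups.append([labels[i] for i in sorted(members)])
    return groups

groups = group_peaks(PM, names)
for g in groups:
    print(g)
```

With the assumed cutoff, the peaks fall into three groups corresponding to leucine, valine and isoleucine. Any cutoff lying between the largest inter spin system value and the smallest intra spin system value in Table 1 gives the same grouping.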

Using the product moment correlation coefficients listed in Table 1, it is possible to make definitive assignments of the amino acid spin systems, including those peaks in the confusing α-proton region. The three amino acids, leucine, isoleucine and valine, are ubiquitous in biofluids, and have been identified in metabonomic studies on urine,^(7,8) blood plasma,⁹⁻¹¹ aqueous liver extracts,¹⁰⁻¹¹ brain fluid,¹² wine¹³ and beer.¹⁴ However, the identification of these three amino acids in those complex mixtures has often been made based simply on the observation of proton NMR peaks near 1 ppm. While connectivity or spiking experiments have been used, they are time consuming. In contrast, the use of the fast 1D semiselective TOCSY experiment in combination with Pearson product moment correlation coefficient analysis clearly defines the unique spectral signature of a complete spin system. Thus the product moment correlation coefficient method can be used both as a test to confirm the integrity of any given TOCSY peak, and also as a means to identify the spin system and confirm the identity of the chemical species detected from multiple, correlated peaks in the spectra.

A Test of the Sensitivity of Semiselective TOCSY Spectra as Data Inputs for Metabonomic PCA: An important feature of the selective TOCSY experiment is the ability to focus the analysis of different samples on components that can be used to draw distinctions between subtly different subpopulations in sets of very similar, very complex samples. An experiment was performed on human urine samples to test the ability of semiselective TOCSY to make such subtle distinctions. FIG. 6 presents 1D proton NMR and semiselective TOCSY spectra of a control human urine sample, and the same sample spiked with 250 μM isoleucine. It should be clear that addition of 250 μM isoleucine to urine produces only very subtle, almost undetectable, differences in the 1D proton NMR spectrum, as can be seen by comparing FIGS. 6A and 6B. In contrast, dramatic differences are observed in the semiselective TOCSY spectra of the spiked and control samples (FIGS. 6D and 6C).

Intuitively, it would seem that the use of semiselective TOCSY spectra as data inputs for PCA calculations would make metabonomic studies more sensitive to small differences in metabolite concentrations. To test this idea, three samples of human urine were collected from a single individual over the course of a single day. Each of the three samples was divided into three aliquots, and two of the aliquots from each sample were spiked with amounts of isoleucine varying between 185 and 315 μM. The exact amount of isoleucine added to each spiked sample was determined from the first six numbers between 185 and 315 occurring in a list generated using the EXCEL random number utility. This sample preparation procedure generated a set of nine human urine samples, of which six were spiked with 250±65 μM isoleucine. 1D proton spectra and semiselective TOCSY spectra, taken with the selective excitation pulse centered at 1 ppm, were acquired on each of these samples. The resulting spectra were subjected to correlation PCA calculations using the spectral region from 1.2 to 4.2 ppm, that is, the region of the complete NMR spectrum containing isoleucine peaks, exclusive of the methyl peak at 1 ppm that was used for the selective excitation (see additional discussion below).
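A correlation PCA of the kind described above can be sketched in Python as follows. This is a minimal illustration assuming that "correlation PCA" means PCA on autoscaled (mean-centered, unit-variance) bins; the function name `correlation_pca`, the 0.01 ppm axis and the random placeholder spectra are hypothetical and do not represent the measured data.

```python
import numpy as np

def correlation_pca(spectra, ppm, lo=1.2, hi=4.2, n_components=2):
    """PCA on autoscaled bins within [lo, hi] ppm, computed via SVD."""
    region = (ppm >= lo) & (ppm <= hi)       # keep only the chosen spectral region
    X = spectra[:, region]
    std = X.std(axis=0, ddof=1)
    keep = std > 0                            # drop bins with no variation
    Z = (X[:, keep] - X[:, keep].mean(axis=0)) / std[keep]
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates (PC1, PC2, ...)
    explained = s**2 / np.sum(s**2)                   # fraction of variance per component
    return scores, explained[:n_components]

# Placeholder input: nine "samples" of random numbers on a hypothetical ppm axis.
rng = np.random.default_rng(0)
ppm = np.linspace(0.5, 4.5, 401)
spectra = rng.normal(size=(9, ppm.size))

scores, explained = correlation_pca(spectra, ppm)
print(scores.shape)
print(explained)
```

Plotting the first column of `scores` against the second would give a PC1 vs. PC2 score plot for the nine samples.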

The PC1 vs. PC2 PCA score plots shown in FIG. 7 clearly indicate the advantage of the semiselective TOCSY approach. The score plot calculated using the semiselective TOCSY spectra as data inputs shows a clear discrimination between the spiked and control samples (FIG. 7A). In contrast, the PCA score plot calculated using 1D proton NMR spectra as data inputs does not discriminate between the isoleucine spiked and control samples (FIG. 7B). Rather, the clustering observed in the PCA score plot calculated using 1D proton spectra is random, and derives neither from the origin of the urine nor from the spiking of the samples. The correct clustering is dramatically improved using the selective TOCSY approach. It is also interesting to note that the discrimination is largely along PC2. We found that PC1 was in fact dominated by (chemical) noise due to the variation of the many components among the different urine samples.

As an alternative, it is possible to use the 1D proton spectra as data inputs but use only those frequency bins that include the isoleucine signals. In this third PCA calculation, data inputs were limited to the 57 bins of the 1D proton spectra containing the isoleucine α peak, at 3.65 to 3.75 ppm, β peak, at 1.95 to 2.05 ppm, γ1 peak, at 1.45 to 1.55 ppm, and γ2 peak, at 1.25 to 1.35 ppm. However, this approach also fails to produce any clean score plot discrimination between the spiked and control samples (FIG. 7C).
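The restriction of data inputs to the isoleucine bins might be expressed as a boolean mask over the chemical shift axis, as in the Python sketch below. The window limits are those quoted above; the 0.01 ppm bin width and the names `ILE_WINDOWS` and `window_mask` are assumptions for illustration, so the number of bins selected here will not necessarily equal the 57 bins used in the calculation described.

```python
import numpy as np

# Isoleucine peak windows quoted in the text, in ppm.
ILE_WINDOWS = {
    "alpha": (3.65, 3.75),
    "beta": (1.95, 2.05),
    "gamma1": (1.45, 1.55),
    "gamma2": (1.25, 1.35),
}

def window_mask(ppm, windows):
    """Return True for every bin whose shift falls inside any window."""
    mask = np.zeros(ppm.shape, dtype=bool)
    for lo, hi in windows.values():
        mask |= (ppm >= lo) & (ppm <= hi)
    return mask

ppm = np.arange(0.5, 4.5, 0.01)      # hypothetical bin centers
mask = window_mask(ppm, ILE_WINDOWS)
print(int(mask.sum()), "bins selected")
```

Indexing a matrix of binned spectra with `[:, mask]` would then leave only the selected bins as PCA data inputs.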

The most intense peaks of the isoleucine spectra are the γ and δ methyl peaks found between 0.90 and 1.1 ppm. These peaks were used as the selective TOCSY target peaks in the isoleucine spiking study, and were excluded from the PCA calculations presented in FIG. 7. In fact, inclusion of these intense methyl peaks in the PCA calculation has very little effect on the quality of the spiked vs. control clustering observed in the resulting PCA score plots (see FIGS. 8 and 9). The clustering produced in the selective TOCSY spectra based calculations remains very good (FIG. 9A), while no actual discriminatory clustering is produced in the 1D proton based calculations (FIGS. 9B and 9C). Thus the calculations, both exclusive and inclusive of the isoleucine methyl peaks, demonstrate that the selective TOCSY approach is more sensitive to small differences in metabolite concentration, or less intense spectral features, than the standard metabonomics approach.

It should be noted that the above procedure, introduction of complete selective TOCSY spectra as PCA data inputs, was performed in this particular case only as a test of the greater discriminatory sensitivity of the semiselective TOCSY spectra relative to 1D proton spectra. In practice, in an actual metabonomics study, the selective TOCSY peaks for a number of compounds of interest would be integrated, and the integral intensity numbers used as PCA data inputs.³ The use of the peak integral intensities is preferable in that it facilitates more rapid statistical analysis, allows for the testing of the TOCSY peaks for purity using the Pearson correlation method, and allows for the establishment of the statistical significance of the different chemical components using MANOVA.³
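The peak integration step can be sketched in Python as a sum over ppm windows. This is an illustrative assumption about how the integrals might be obtained (a rectangle rule on a uniformly spaced axis); the function name `integrate_peaks`, the window limits and the synthetic Gaussian line are hypothetical and are not taken from the measured spectra.

```python
import numpy as np

def integrate_peaks(spectrum, ppm, windows):
    """Integrate intensity over each ppm window (rectangle rule, uniform axis)."""
    width = abs(ppm[1] - ppm[0])     # spacing between adjacent points
    return {
        name: float(spectrum[(ppm >= lo) & (ppm <= hi)].sum() * width)
        for name, (lo, hi) in windows.items()
    }

# Synthetic spectrum: a single Gaussian line of unit area centered at 2.00 ppm.
ppm = np.linspace(0.5, 4.5, 4001)
sigma = 0.01
spectrum = np.exp(-0.5 * ((ppm - 2.00) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

windows = {"beta": (1.95, 2.05), "empty": (3.00, 3.10)}   # hypothetical windows
integrals = integrate_peaks(spectrum, ppm, windows)
print(integrals)
```

Collecting such integrals for each sample into one row per sample gives a small matrix that could serve as the PCA data input, and whose columns could be tested pairwise with the Pearson correlation method.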

The use of semiselective TOCSY spectra as data inputs for PCA calculations provides a relatively rapid means of distinguishing between sets of biofluid samples with subtle differences in metabolite concentrations. When multiple components are excited by the semiselective pulse, the Pearson product moment correlation coefficient provides a method to distinguish pure TOCSY peaks and to aid in their assignment. Finally, the use of semiselective TOCSY spectra as PCA data inputs is more sensitive to small differences in metabolite concentrations than PCA calculations using 1D proton spectra as data inputs. Good separation of spiked and control samples could be made easily at the level of 250 μM with the use of selective TOCSY using a 1 min acquisition time.

It should be understood and appreciated that additional external standardization processes, such as electronic quantitating methods (e.g., ERETIC), may also be used in addition to and/or in conjunction with the selective TOCSY processes of the present teachings. As such, the present teachings are not intended to be limiting in nature herein.

While exemplary embodiments incorporating the principles of the present teachings have been disclosed hereinabove, the present teachings are not limited to the disclosed embodiments. Instead, this application is intended to cover any variations, uses, or adaptations of the invention using its general principles. Further, this application is intended to cover such departures from the present disclosure as come within known or customary practice in the art to which this invention pertains and which fall within the limits of the appended claims.

References: The following are incorporated herein by reference in their entirety:

-   (1) Lindon, J. C.; Holmes, E.; Nicholson, J. K. Prog. Nucl. Magn. Reson. Spec. 2001, 39, 1-40;
-   (2) Lindon, J. C.; Holmes, E.; Nicholson, J. K. Prog. Nucl. Magn. Reson. Spec. 2004, 45, 109-143;
-   (3) Sandusky, P. O.; Raftery, D. R. Anal. Chem. 2005, 77, 2455-2463;
-   (4) Johnson, R. A.; Wichern, D. W. Applied Multivariate Statistical Analysis, 4th Edition, Prentice Hall, Upper Saddle River, N.J., 1998;
-   (5) Krzanowski, W. J. Principles of Multivariate Analysis: A User's Perspective, Revised Edition, Oxford University Press, Oxford, 2000;
-   (6) Cloarec, O.; Dumas, M.-E.; Craig, A.; Barton, R. H.; Trygg, J.; Hudson, J.; Blancher, C.; Gauguier, D.; Lindon, J. C.; Holmes, E.; Nicholson, J. K. Anal. Chem. 2005, 77, 1282-1289;
-   (7) Holmes, E.; Nicholls, A. W.; Lindon, J. C.; Connor, S. C.; Connelly, J. C.; Haselden, J. N.; Damment, S. J. P.; Spraul, M.; Neidig, P.; Nicholson, J. K. Chem. Res. Toxicol. 2000, 13, 471-478;
-   (8) Williams, R. E.; Jacobsen, M.; Lock, E. A. Chem. Res. Toxicol. 2003, 16, 1207-1216;
-   (9) de Graaf, R. A.; Behar, K. L. Anal. Chem. 2003, 75, 2100-2104;
-   (10) Waters, N. J.; Holmes, E.; Williams, A.; Waterfield, C. J.; Farrant, R. D.; Nicholson, J. K. Chem. Res. Toxicol. 2001, 14, 1401-1412;
-   (11) Coen, M.; Lenz, E. M.; Nicholson, J. K.; Wilson, I. D.; Pognan, F.; Lindon, J. C. Chem. Res. Toxicol. 2003, 16, 295-303;
-   (12) Khandelwal, P.; Beyer, C. E.; Lin, Q.; Schechter, L. E.; Bach II, A. C. Anal. Chem. 2004, 76, 4123-4127;
-   (13) Brescia, M. A.; Kosir, I. J.; Caldarola, V.; Kidric, J.; Sacco, A. J. Agric. Food Chem. 2003, 51, 21-26;
-   (14) Nord, L. I.; Vaag, P.; Duus, J. O. Anal. Chem. 2004, 76, 4790-4798;
-   (15) Nicholson, J. K.; Foxall, P. J. D.; Spraul, M.; Farrant, R. D.; Lindon, J. C. Anal. Chem. 1995, 67, 793-811;
-   (16) Belton, P. S.; Colquhoun, I. J.; Kemsley, E. K.; Delgadillo, I.; Roma, P.; Dennis, M. J.; Sharman, M.; Holmes, E.; Nicholson, J. K.; Spraul, M. Food Chemistry 1998, 61, 207-213;
-   (17) Kessler, H.; Oschkinat, H.; Griesinger, C.; Bermel, W. J. Magn. Reson. 1986, 70, 106-133;
-   (18) Stott, K.; Stonehouse, J.; Keeler, J.; Hwang, T.-L.; Shaka, A. J. J. Am. Chem. Soc. 1995, 117, 4199-4200;
-   (19) Bax, A.; Davis, D. G. J. Magn. Reson. 1985, 65, 355-360; and
-   (20) Kupce, E.; Nishida, T.; Freeman, R. Prog. Nucl. Magn. Reson. Spec. 2003, 42, 95-122.

1. A method for differentiating complex mixtures each having one or more chemical species, comprising: producing a sample NMR spectrum by subjecting a mixture to a selective spectroscopy process, the NMR spectrum having individual spectral peaks representative of the one or more chemical species within the mixture; identifying the one or more chemical species by analyzing the individual spectral peaks within the mixture; and subjecting the individual spectral peaks to a multivariate statistical analysis.
 2. The method of claim 1, wherein the mixture comprises a biofluid mixture.
 3. The method of claim 1, further comprising using the multivariate statistical analysis to determine purity of the individual spectral peaks and to assign the individual spectral peaks into spin systems.
 4. The method of claim 1, wherein the selective spectroscopy analysis is a total correlation spectroscopy analysis.
 5. The method of claim 1, wherein the multivariate statistical analysis comprises a Pearson product moment correlation test.
 6. The method of claim 1, wherein the multivariate statistical analysis comprises a principal component analysis process.
 7. The method of claim 1, wherein the multivariate statistical analysis comprises an orthogonal-partial least squares-discriminate analysis.
 8. The method of claim 1, wherein subjecting the mixture to a selective spectroscopy process comprises optimizing the duration of an excitation pulse during the spectroscopy analysis to maximize sensitivity.
 9. The method of claim 8, wherein the duration of the excitation pulse is about 5 to about 40 ms.
 10. The method of claim 1, wherein subjecting the mixture to a selective spectroscopy process comprises optimizing a mixing pulse during the spectroscopy analysis.
 11. The method of claim 1, further comprising classifying the individual spectral peaks by applying a frequency-domain multiplex scheme.
 12. The method of claim 11, wherein the frequency-domain multiplex scheme comprises a Hadamard Transform NMR matrix.
 13. A method for quantifying one or more chemical species within a complex mixture, comprising: subjecting a first mixture to a total correlation spectroscopy analysis to produce a first spectrum composed of individual spectral peaks representative of the one or more chemical species within the first mixture; acquiring a second spectrum from an isolated standard sample from a second mixture, the second spectrum being produced by subjecting the second mixture to a second total correlation spectroscopy analysis; and comparing the first spectrum to the second spectrum to quantify the one or more chemical species within the first mixture.
 14. The method of claim 13, further comprising subjecting the individual spectral peaks of the first mixture to a multivariate statistical analysis.
 15. The method of claim 13, further comprising optimizing the duration of an excitation pulse during the spectroscopy analysis of the first mixture to maximize sensitivity.
 16. The method of claim 13, wherein the first mixture comprises a biofluid mixture.
 17. The method of claim 13, wherein the multivariate statistical analysis comprises a Pearson product moment correlation test.
 18. The method of claim 13, wherein the multivariate statistical analysis comprises a principal component analysis process.
 19. The method of claim 13, wherein the multivariate statistical analysis comprises an orthogonal-partial least squares-discriminate analysis.
 20. The method of claim 15, wherein the duration of the excitation pulse is about 5 to about 40 ms.
 21. The method of claim 13, further comprising classifying the individual spectral peaks by applying a frequency-domain multiplex scheme.
 22. The method of claim 21, wherein the frequency-domain multiplex scheme comprises a Hadamard Transform NMR matrix.
 23. The method of claim 13, further comprising using the multivariate statistical analysis to determine purity of the individual spectral peaks and to assign the individual spectral peaks into spin systems.
 24. The method of claim 13, further comprising differentiating the complex mixture by analyzing the concentrations of the chemical species quantified during the total correlation spectroscopy analysis.
 25. The method of claim 13, further comprising optimizing a mixing pulse during the spectroscopy analysis of the first mixture.
 26. A method for quantifying one or more chemical species within a complex mixture, comprising: subjecting the mixture to a first total correlation spectroscopy analysis to produce a first spectrum composed of individual spectral peaks representative of the one or more chemical species within the mixture; acquiring a second spectrum from an isolated standard sample within the mixture, the second spectrum being produced by subjecting the mixture to a second total correlation spectroscopy analysis; and comparing the first spectrum to the second spectrum to quantify the one or more chemical species within the mixture.
 27. The method of claim 26, further comprising subjecting the individual spectral peaks of the mixture to a multivariate statistical analysis.
 28. The method of claim 27, wherein the multivariate statistical analysis comprises a Pearson product moment correlation test.
 29. The method of claim 26, wherein the mixture comprises a biofluid mixture. 